*Explore missing values

*Tim Goedemé 28/10/2020

/*

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 
This file can be changed and re-shared for non-commercial use, as long as our original work 
is recognised and the revised work is made available under the same conditions.

When using this do-file, please cite as:
Goedemé, T., Nolan, B., Paskov, M., & Weisstanner, D. (2021). 
Occupational Social Class and Earnings Inequality in Europe: A Comparative Assessment. 
In: Social Indicators Research. DOI: https://doi.org/10.1007/s11205-021-02746-z; https://timgoedeme.com/tools/esec-in-eu-silc/


*/

global place1 <<data directory>>
global place2 <<directory for storing regression outputs>>
global place3 <<directory for storing other results>>
global countries AT BE BG CH CY CZ DE DK EE EL ES FI FR HR HU IE IT LT LU LV MT NL NO PL PT RO RS SE SI UK



*1. Look at gaps in the data: which interaction terms cannot be estimated?
**************************************************************************
*=> in which cases would it make sense to merge or omit social classes, and in which cases would it be preferable to drop a control variable?

*Version 1

cap postclose Caverages
postfile Caverages str40 variable str5 country class1 class2 class3 class4 class5 class6 class7 class8 class9 total using "${place3}\Caverages1.dta", replace
foreach ctry of global countries {
	di "`ctry'", _continue
	cap mat drop results	
	quietly {
		local year 2018
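		* Forward slashes are used as path separators: in Stata, a backslash placed directly
		* before a local macro reference suppresses its expansion, so the country folder
		* would not be resolved and the capture prefix would silently skip every country.
		cap use "${place1}/`ctry'/`year'/c`ctry'`year'_addvars1.dta", clear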
		if _rc==0 {
			drop age hydisp eqs hystd thresh60 arop60 actage active
			svyset psu1 [pw=weight], strata(strata1)
			
			replace sub = 0 if missers!=0
			replace sub = 0 if earns1<=0 | earns2<=0
			
			gen class=esec08
			
			ta education, gen(educations)
			global educ
			local rows = r(r)
			forvalues r=1/`rows' {
				global educ ${educ} educations`r'
			}
			
			ta sector, gen(sectors)
			global sector
			local rows = r(r)
			forvalues r=1/`rows' {
				global sector ${sector} sectors`r'
			}
			
			global vars
			global vars fyfte sex immigrant ${educ} disabled health ${sector} temporary career single oneearner singleparent breadwinner othernochild otherwchild
			local counter = 0
			foreach v of global vars {
				local counter= `counter' +1
				noi di `counter', _continue
				* Weighted mean of the variable within each of the 9 ESeC classes (missing if the class is empty)
				forvalues c=1/9 {
					local c`c'=.
					sum `v' [iw=weight] if esec08==`c' & sub==1
					local c`c'=r(mean)
				}
				sum `v' [iw=weight] if sub==1
				local c10=r(mean)
				post Caverages ("`v'") ("`ctry'") (`c1') (`c2') (`c3') (`c4') (`c5') (`c6') (`c7') (`c8') (`c9') (`c10')
			}
			
			noi di "."
		}
	}
}
postclose Caverages

*Analyse file

use "${place3}\Caverages1.dta", clear
egen min = rowmin(class1 class2 class3 class4 class5 class6 class7 class8 class9 total)
ta variable if min==0
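
* Optional sketch, not part of the original analysis: rowmin() ignores missing
* values, so a class without observations in the subsample (posted as missing)
* is not flagged by min==0. Assuming such empty cells are also of interest, they
* can be counted per row; n_empty is a hypothetical helper variable.
egen n_empty = rowmiss(class1 class2 class3 class4 class5 class6 class7 class8 class9)
ta country if n_empty>0
drop n_empty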

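* min2: as min, but leaving out class 5
egen min2 = rowmin(class1 class2 class3 class4 class6 class7 class8 class9 total)
ta variable if min2==0

* min3: min2 for the countries listed below, min for all other countries
gen min3 = min
replace min3 = min2 if regexm("DK DE NL MT EE BE CZ NO SE LU CH PT SI CY UK FR IT", country)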

* min4: as min3 (class 5 ignored for the listed countries), but set to missing for
* temporary, disabled and the sector dummies
cap drop min4
egen min4 = rowmin(class1 class2 class3 class4 class5 class6 class7 class8 class9 total)
replace min4=. if regexm("temporary disabled sectors1 sectors2 sectors3 sectors4 sectors5 sectors6 sectors7 sectors8 sectors9 sectors10 sectors11 sectors12 sectors13", variable)

cap drop min5
egen min5 = rowmin(class1 class2 class3 class4 class6 class7 class8 class9 total)
replace min5=. if regexm("temporary disabled sectors1 sectors2 sectors3 sectors4 sectors5 sectors6 sectors7 sectors8 sectors9 sectors10 sectors11 sectors12 sectors13", variable)
replace min4=min5 if regexm("DK DE NL MT EE BE CZ NO SE LU CH PT SI CY UK FR IT", country)

ta variable if min4==0

*Version 2: drop temporary (not applicable to the self-employed), disability (sample too small) and economic sector (as it sometimes overlaps strictly with social class when 9 classes are used)

cap postclose Caverages
postfile Caverages str40 variable str5 country class1 class2 class3 class4 class5 class6 class7 class8 class9 total using "${place3}\Caverages2.dta", replace
foreach ctry of global countries {
	di "`ctry'", _continue
	cap mat drop results	
	quietly {
		local year 2018
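		* Forward slashes are used as path separators: in Stata, a backslash placed directly
		* before a local macro reference suppresses its expansion, so the country folder
		* would not be resolved and the capture prefix would silently skip every country.
		cap use "${place1}/`ctry'/`year'/c`ctry'`year'_addvars1.dta", clear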
		if _rc==0 {
			drop age hydisp eqs hystd thresh60 arop60 actage active
			svyset psu1 [pw=weight], strata(strata1)
			
			replace sub = 0 if missers!=0
			replace sub = 0 if earns1<=0 | earns2<=0
			
			gen class=esec08
			
			ta education, gen(educations)
			global educ
			local rows = r(r)
			forvalues r=1/`rows' {
				global educ ${educ} educations`r'
			}
			
			global vars
			global vars fyfte sex immigrant ${educ} health career nchilds nadults ndepadults
			local counter = 0
			foreach v of global vars {
				local counter= `counter' +1
				noi di `counter', _continue
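				* Weighted mean of the variable within each of the 9 ESeC classes (missing if the class is empty)
				forvalues c=1/9 {
					local c`c'=.
					sum `v' [iw=weight] if esec08==`c' & sub==1
					local c`c'=r(mean)
				}
				* For the countries listed below, class 5 is overwritten with a placeholder (100), so it
				* cannot produce a zero in the row minimum (this mirrors min2/min3 in version 1)
				if regexm("DK DE NL MT EE BE CZ NO SE LU CH PT SI CY UK FR IT", "`ctry'") local c5=100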
				sum `v' [iw=weight] if sub==1
				local c10=r(mean)
				post Caverages ("`v'") ("`ctry'") (`c1') (`c2') (`c3') (`c4') (`c5') (`c6') (`c7') (`c8') (`c9') (`c10')
			}
			
			noi di "."
		}
	}
}
postclose Caverages

*Analyse file

use "${place3}\Caverages2.dta", clear
egen min = rowmin(class1 class2 class3 class4 class5 class6 class7 class8 class9 total)
ta variable if min==0
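
* Optional sketch, not part of the original analysis: list the country-variable
* pairs behind the tabulation above, to see which countries are affected.
list country variable if min==0, sepby(country) noobs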

